Multi-Class Image Classification Model for American Sign Language Alphabet Using TensorFlow Take 3¶

David Lowe¶

November 9, 2022¶

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The American Sign Language Alphabet Dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: The dataset contains over 223,000 images of American Sign Language alphabet hand gestures. The images can be broken down into 29 different labels: the 26 letters A through Z, plus the space, del, and nothing classes.

ANALYSIS: The ResNet50V2 model's performance achieved an accuracy score of 98.45% after three epochs using the training dataset. When we applied the model to the validation dataset, the model achieved an accuracy score of 84.54%.

CONCLUSION: In this iteration, the TensorFlow ResNet50V2 CNN model appeared suitable for modeling this dataset.

Dataset ML Model: Multi-class classification with image features

Dataset Used: ASL (American Sign Language) Alphabet Dataset

Dataset Reference: https://www.kaggle.com/datasets/debashishsau/aslamerican-sign-language-aplhabet-dataset

One source of potential performance benchmarks: https://www.kaggle.com/datasets/debashishsau/aslamerican-sign-language-aplhabet-dataset/code

Task 1 - Prepare Environment¶

In [1]:
# Retrieve CPU information from the system
ncpu = !nproc
print("The number of available CPUs is:", ncpu[0])
The number of available CPUs is: 2
In [2]:
# Retrieve memory configuration information
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))
Your runtime has 13.6 gigabytes of available RAM

In [3]:
# Retrieve GPU configuration information
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
print(gpu_info)
Sat Nov  5 13:51:29 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   70C    P8    13W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

1.a) Load libraries and modules¶

In [4]:
# Set the random seed number for reproducible results
RNG_SEED = 888
In [5]:
import random
random.seed(RNG_SEED)
import numpy as np
np.random.seed(RNG_SEED)
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import os
import sys
import math
# import boto3
import zipfile
from datetime import datetime
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score

import tensorflow as tf
tf.random.set_seed(RNG_SEED)
from tensorflow import keras
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.preprocessing.image import ImageDataGenerator

1.b) Set up the controlling parameters and functions¶

In [6]:
# Begin the timer for the script processing
START_TIME_SCRIPT = datetime.now()
In [7]:
# Set up the number of CPU cores available for multi-thread processing
N_JOBS = 1

# Set up the flag for sending progress emails (setting to True will send status emails!)
NOTIFY_STATUS = False

# Set the percentage sizes for splitting the dataset
TEST_SET_RATIO = 0.2
VAL_SET_RATIO = 0.2

# Set the number of folds for cross validation
N_FOLDS = 3
N_ITERATIONS = 1

# Set various default modeling parameters
DEFAULT_LOSS = 'categorical_crossentropy'
DEFAULT_METRICS = ['accuracy']
DEFAULT_OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=0.0001)
CLASSIFIER_ACTIVATION = 'softmax'
MAX_EPOCHS = 3
BATCH_SIZE = 32
# CLASS_LABELS = []
# CLASS_NAMES = []
# RAW_IMAGE_SIZE = (250, 250)
TARGET_IMAGE_SIZE = (224, 224)
INPUT_IMAGE_SHAPE = (TARGET_IMAGE_SIZE[0], TARGET_IMAGE_SIZE[1], 3)

# Define the labels to use for graphing the data
TRAIN_METRIC = "accuracy"
VALIDATION_METRIC = "val_accuracy"
TRAIN_LOSS = "loss"
VALIDATION_LOSS = "val_loss"

# Define the directory locations and file names
STAGING_DIR = 'staging/'
TRAIN_DIR = 'staging/ASL_Alphabet_Dataset/asl_alphabet_train'
# VALID_DIR = ''
TEST_DIR = 'staging/ASL_Alphabet_Dataset/asl_alphabet_test'
TRAIN_DATASET = 'archive.zip'
# VALID_DATASET = ''
# TEST_DATASET = ''
# TRAIN_LABELS = ''
# VALID_LABELS = ''
# TEST_LABELS = ''
# OUTPUT_DIR = 'staging/'
# SAMPLE_SUBMISSION_CSV = 'sample_submission.csv'
# FINAL_SUBMISSION_CSV = 'submission.csv'

# Check the number of GPUs accessible through TensorFlow
print('Num GPUs Available:', len(tf.config.list_physical_devices('GPU')))

# Print out the TensorFlow version for confirmation
print('TensorFlow version:', tf.__version__)
Num GPUs Available: 1
TensorFlow version: 2.9.2
In [8]:
# Set up the email notification function
def status_notify(msg_text):
    import boto3  # imported here so the dependency is only required when notifications are enabled
    access_key = os.environ.get('SNS_ACCESS_KEY')
    secret_key = os.environ.get('SNS_SECRET_KEY')
    aws_region = os.environ.get('SNS_AWS_REGION')
    topic_arn = os.environ.get('SNS_TOPIC_ARN')
    if (access_key is None) or (secret_key is None) or (aws_region is None) or (topic_arn is None):
        sys.exit("Incomplete notification setup info. Script Processing Aborted!!!")
    sns = boto3.client('sns', aws_access_key_id=access_key, aws_secret_access_key=secret_key, region_name=aws_region)
    response = sns.publish(TopicArn=topic_arn, Message=msg_text)
    if response['ResponseMetadata']['HTTPStatusCode'] != 200:
        print('Status notification not OK with HTTP status code:', response['ResponseMetadata']['HTTPStatusCode'])
In [9]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 1 - Prepare Environment completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))

Task 2 - Load and Prepare Images¶

In [10]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 2 - Load and Prepare Images has begun on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [11]:
# Clean up the old files and download directories before receiving new ones
!rm -rf staging/

if not os.path.exists(TRAIN_DATASET):
    !wget https://dainesanalytics.com/datasets/kaggle-debashishsau-sign-language-aplhabet/archive.zip

zip_ref = zipfile.ZipFile(TRAIN_DATASET, 'r')
zip_ref.extractall(STAGING_DIR)
zip_ref.close()
In [12]:
CLASS_LABELS = os.listdir(TRAIN_DIR)
print(CLASS_LABELS)
NUM_CLASSES = len(CLASS_LABELS)
print('Total number of classes detected:', NUM_CLASSES)
['H', 'T', 'S', 'R', 'V', 'N', 'F', 'space', 'W', 'D', 'Q', 'X', 'J', 'M', 'I', 'B', 'E', 'nothing', 'Y', 'C', 'U', 'A', 'G', 'P', 'O', 'Z', 'del', 'L', 'K']
Total number of classes detected: 29
In [13]:
# Brief listing of training image files for each class
for c_label in CLASS_LABELS:
    training_class_dir = os.path.join(TRAIN_DIR, c_label)
    training_class_files = os.listdir(training_class_dir)
    print('Number of training images for', c_label, ':', len(training_class_files))
    print('Training samples for', c_label, ':', training_class_files[:5],'\n')
Number of training images for H : 7906
Training samples for H : ['H2899.jpg', 'H (3702).jpg', 'H (399).jpg', 'H (2657).jpg', 'H26.jpg'] 

Number of training images for T : 8054
Training samples for T : ['T21.jpg', 'T2027.jpg', 't_40_rotate_9.jpeg', 'T (2531).jpg', 't_34_rotate_7.jpeg'] 

Number of training images for S : 8109
Training samples for S : ['S (1687).jpg', 'S (3113).jpg', 's_46_rotate_8.jpeg', 's_14_rotate_5.jpeg', 'S2780.jpg'] 

Number of training images for R : 8021
Training samples for R : ['R2942.jpg', 'R2948.jpg', 'R (1118).jpg', 'r_7_rotate_8.jpeg', 'R (2692).jpg'] 

Number of training images for V : 7597
Training samples for V : ['V (928).jpg', 'V (3496).jpg', 'V (3505).jpg', 'V (2662).jpg', 'V (2228).jpg'] 

Number of training images for N : 7932
Training samples for N : ['N1300.jpg', 'N (3434).jpg', 'n_55_rotate_5.jpeg', 'N (3453).jpg', 'N1317.jpg'] 

Number of training images for F : 8031
Training samples for F : ['F (912).jpg', 'F2219.jpg', 'F (3179).jpg', 'F (1555).jpg', 'F (3089).jpg'] 

Number of training images for space : 7071
Training samples for space : ['space2050.jpg', 'space2560.jpg', 'space (2029).jpg', 'space831.jpg', 'space (2691).jpg'] 

Number of training images for W : 7787
Training samples for W : ['W352.jpg', 'w_35_rotate_9.jpeg', 'w_53_rotate_9.jpeg', 'W (1762).jpg', 'W (3090).jpg'] 

Number of training images for D : 7629
Training samples for D : ['D (1864).jpg', 'D (2058).jpg', 'D2653.jpg', 'D (202).jpg', 'D2665.jpg'] 

Number of training images for Q : 7954
Training samples for Q : ['Q1212.jpg', 'Q383.jpg', 'Q (619).jpg', 'Q (2662).jpg', 'q_4_rotate_2.jpeg'] 

Number of training images for X : 8093
Training samples for X : ['x_52_rotate_2.jpeg', 'X (349).jpg', 'X (1045).jpg', 'X2470.jpg', '91.jpg'] 

Number of training images for J : 7503
Training samples for J : ['J1472.jpg', 'J (2817).jpg', 'j_53_rotate_8.jpeg', 'J (638).jpg', 'J (3000).jpg'] 

Number of training images for M : 7900
Training samples for M : ['M166.jpg', 'M361.jpg', 'M (1761).jpg', 'm_50_rotate_8.jpeg', 'M1929.jpg'] 

Number of training images for I : 7953
Training samples for I : ['I1290.jpg', 'I (1281).jpg', 'I (3090).jpg', 'I2815.jpg', '91.jpg'] 

Number of training images for B : 8309
Training samples for B : ['B (3114).jpg', 'B (540).jpg', 'B (3019).jpg', '91.jpg', 'B (1659).jpg'] 

Number of training images for E : 7744
Training samples for E : ['e_18_rotate_5.jpeg', 'E (2086).jpg', 'E (373).jpg', 'E (1266).jpg', 'E (1267).jpg'] 

Number of training images for nothing : 3030
Training samples for nothing : ['nothing1317.jpg', 'nothing1916.jpg', 'nothing1454.jpg', 'nothing350.jpg', 'nothing2532.jpg'] 

Number of training images for Y : 8178
Training samples for Y : ['Y (1133).jpg', 'Y (3133).jpg', 'Y337.jpg', 'Y1241.jpg', 'Y (1636).jpg'] 

Number of training images for C : 8146
Training samples for C : ['C (510).jpg', 'C (439).jpg', 'C (3820).jpg', '91.jpg', 'C1602.jpg'] 

Number of training images for U : 8023
Training samples for U : ['U (1760).jpg', 'U (3751).jpg', 'U (2389).jpg', 'U1109.jpg', 'U650.jpg'] 

Number of training images for A : 8458
Training samples for A : ['A (1725).jpg', 'A2510.jpg', 'A (3046).jpg', 'A (2274).jpg', 'A (2428).jpg'] 

Number of training images for G : 7844
Training samples for G : ['G141.jpg', 'g_52_rotate_5.jpeg', 'G1316.jpg', 'G1799.jpg', 'G (592).jpg'] 

Number of training images for P : 7601
Training samples for P : ['p_53_rotate_2.jpeg', 'P1709.jpg', 'P2246.jpg', 'P (3294).jpg', 'P (1054).jpg'] 

Number of training images for O : 8140
Training samples for O : ['O2204.jpg', 'O1913.jpg', 'O1411.jpg', 'O (2872).jpg', 'O (3477).jpg'] 

Number of training images for Z : 7410
Training samples for Z : ['Z (2870).jpg', 'Z (2161).jpg', 'Z2911.jpg', 'Z224.jpg', 'Z (1923).jpg'] 

Number of training images for del : 6836
Training samples for del : ['del (3724).jpg', 'del1239.jpg', 'del1380.jpg', 'del (1332).jpg', 'del (2381).jpg'] 

Number of training images for L : 7939
Training samples for L : ['L (2460).jpg', 'l_52_rotate_4.jpeg', '91.jpg', 'L (2447).jpg', 'L (337).jpg'] 

Number of training images for K : 7876
Training samples for K : ['K1978.jpg', 'K220.jpg', 'K (2452).jpg', 'K (925).jpg', '91.jpg'] 

In [14]:
# Plot some training images from the dataset
nrows = len(CLASS_LABELS)
ncols = 4
training_examples = []
example_labels = []

fig = plt.gcf()
fig.set_size_inches(ncols * 4, nrows * 3)

for c_label in CLASS_LABELS:
    training_class_dir = os.path.join(TRAIN_DIR, c_label)
    training_class_files = os.listdir(training_class_dir)
    for j in range(ncols):
        training_examples.append(training_class_dir + '/' + training_class_files[j])
        example_labels.append(c_label)
    # print(training_examples)
    # print(example_labels)

for i, img_path in enumerate(training_examples):
    # Set up subplot; subplot indices start at 1
    sp = plt.subplot(nrows, ncols, i+1)
    sp.text(0, 0, example_labels[i])
    # sp.axis('Off')
    img = mpimg.imread(img_path)
    plt.imshow(img)
plt.show()
In [15]:
datagen_kwargs = dict(rescale=1./255, validation_split=VAL_SET_RATIO)
training_datagen = ImageDataGenerator(**datagen_kwargs)
validation_datagen = ImageDataGenerator(**datagen_kwargs)
dataflow_kwargs = dict(class_mode="categorical")

do_data_augmentation = True
if do_data_augmentation:
    training_datagen = ImageDataGenerator(rotation_range=45,
                                          horizontal_flip=True,
                                          vertical_flip=True,
                                          **datagen_kwargs)

print('Loading and pre-processing the training images...')
training_generator = training_datagen.flow_from_directory(directory=TRAIN_DIR,
                                                          target_size=TARGET_IMAGE_SIZE,
                                                          batch_size=BATCH_SIZE,
                                                          shuffle=True,
                                                          seed=RNG_SEED,
                                                          subset="training",
                                                          **dataflow_kwargs)
print('Number of training image batches per epoch of modeling:', len(training_generator))

print('Loading and pre-processing the validation images...')
validation_generator = validation_datagen.flow_from_directory(directory=TRAIN_DIR,
                                                              target_size=TARGET_IMAGE_SIZE,
                                                              batch_size=BATCH_SIZE,
                                                              shuffle=False,
                                                              subset="validation",
                                                              **dataflow_kwargs)
print('Number of validation image batches per epoch of modeling:', len(validation_generator))
Loading and pre-processing the training images...
Found 178472 images belonging to 29 classes.
Number of training image batches per epoch of modeling: 5578
Loading and pre-processing the validation images...
Found 44602 images belonging to 29 classes.
Number of validation image batches per epoch of modeling: 1394
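As a sanity check on the split, the counts reported above can be reproduced by hand. Keras applies validation_split per class, so the overall ratio lands close to (but not exactly at) the requested 20%, and each epoch iterates over ceil(n / batch_size) batches because the final partial batch still counts as one. A minimal sketch of that arithmetic, using the image counts from the generator output above:

```python
import math

# Counts reported by flow_from_directory above
n_train = 178472   # images in the "training" subset
n_val = 44602      # images in the "validation" subset
total = n_train + n_val  # 223,074 images under TRAIN_DIR

# The per-class split lands just under the requested 0.2 overall
print(n_val / total)

# Batches per epoch: the final partial batch still counts as one
print(math.ceil(n_train / 32))  # 5578
print(math.ceil(n_val / 32))    # 1394
```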
In [16]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 2 - Load and Prepare Images completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))

Task 3 - Define and Train Models¶

In [17]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 3 - Define and Train Models has begun on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [18]:
# Define the function for plotting training results for comparison
def plot_metrics(history):
    plt.figure(figsize=(24, 15))
    metrics = [TRAIN_LOSS, TRAIN_METRIC]
    for n, metric in enumerate(metrics):
        name = metric.replace("_", " ").capitalize()
        plt.subplot(1, 2, n+1)
        plt.plot(history.epoch, history.history[metric], color='blue', label='Train')
        plt.plot(history.epoch, history.history['val_'+metric], color='red', linestyle="--", label='Val')
        plt.xlabel('Epoch')
        plt.ylabel(name)
        if metric == TRAIN_LOSS:
            plt.ylim([0, plt.ylim()[1]])
        else:
            plt.ylim([0, 1])
        plt.legend()
In [19]:
# Define the baseline model for benchmarking
def create_nn_model(input_param=INPUT_IMAGE_SHAPE, output_param=NUM_CLASSES, dense_nodes=2048,
                    classifier_activation=CLASSIFIER_ACTIVATION, loss_param=DEFAULT_LOSS,
                    opt_param=DEFAULT_OPTIMIZER, metrics_param=DEFAULT_METRICS):
    base_model = keras.applications.resnet_v2.ResNet50V2(include_top=False, weights='imagenet', input_shape=input_param)
    nn_model = keras.models.Sequential()
    nn_model.add(base_model)
    nn_model.add(keras.layers.Flatten())
    nn_model.add(keras.layers.Dense(dense_nodes, activation='relu'))
    nn_model.add(keras.layers.Dense(output_param, activation=classifier_activation))
    nn_model.compile(loss=loss_param, optimizer=opt_param, metrics=metrics_param)
    return nn_model
In [20]:
# Initialize the neural network model and get the training results for plotting graph
start_time_module = datetime.now()
tf.keras.utils.set_random_seed(RNG_SEED)
baseline_model = create_nn_model()
baseline_model_history = baseline_model.fit(training_generator,
                                            epochs=MAX_EPOCHS,
                                            validation_data=validation_generator,
                                            verbose=1)
print('Total time for model fitting:', (datetime.now() - start_time_module))
Epoch 1/3
5578/5578 [==============================] - 2929s 522ms/step - loss: 0.2467 - accuracy: 0.9275 - val_loss: 0.7239 - val_accuracy: 0.8386
Epoch 2/3
5578/5578 [==============================] - 2822s 506ms/step - loss: 0.0793 - accuracy: 0.9769 - val_loss: 0.8724 - val_accuracy: 0.8453
Epoch 3/3
5578/5578 [==============================] - 2807s 503ms/step - loss: 0.0523 - accuracy: 0.9845 - val_loss: 0.9525 - val_accuracy: 0.8454
Total time for model fitting: 2:23:02.535579
In [21]:
baseline_model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 resnet50v2 (Functional)     (None, 7, 7, 2048)        23564800  
                                                                 
 flatten (Flatten)           (None, 100352)            0         
                                                                 
 dense (Dense)               (None, 2048)              205522944 
                                                                 
 dense_1 (Dense)             (None, 29)                59421     
                                                                 
=================================================================
Total params: 229,147,165
Trainable params: 229,101,725
Non-trainable params: 45,440
_________________________________________________________________
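The parameter counts in the summary can be verified by hand: a Dense layer has one weight per input-unit pair plus one bias per unit, and the Flatten layer turns the backbone's 7x7x2048 output into a 100,352-wide vector. A quick sketch of that arithmetic, taking the backbone's count from the summary above:

```python
# Dense layer parameters = inputs * units + units (one bias per unit)
flat_features = 7 * 7 * 2048        # Flatten output: 100,352 features
dense_nodes = 2048
num_classes = 29

dense_params = flat_features * dense_nodes + dense_nodes
output_params = dense_nodes * num_classes + num_classes
backbone_params = 23564800          # ResNet50V2 base, per the summary above

print(dense_params)                 # 205522944
print(output_params)                # 59421
print(backbone_params + dense_params + output_params)  # 229147165
```

Note that the Flatten-into-Dense head alone contributes roughly 205 million parameters, dwarfing the backbone itself; this is why such heads are often replaced with a GlobalAveragePooling2D layer in later iterations.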
In [22]:
plot_metrics(baseline_model_history)
In [23]:
if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 3 - Define and Train Models completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))

Task 4 - Tune and Optimize Models¶

In [24]:
# if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 4 - Tune and Optimize Models has begun on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [25]:
# Not applicable for this iteration of modeling
In [26]:
# if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 4 - Tune and Optimize Models completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))

Task 5 - Finalize Model and Make Predictions¶

In [27]:
# if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 5 - Finalize Model and Make Predictions has begun on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [28]:
# Not applicable for this iteration of modeling
In [29]:
# if NOTIFY_STATUS: status_notify('(TensorFlow Multi-Class) Task 5 - Finalize Model and Make Predictions completed on ' + datetime.now().strftime('%A %B %d, %Y %I:%M:%S %p'))
In [30]:
print('Total time for the script:', (datetime.now() - START_TIME_SCRIPT))
Total time for the script: 2:24:45.918830